Lamyaa BOUZBIBA Alexandre POUPEAU Eloise JULIEN

1. geneticData

NAm2 = read.table("NAm2.txt", header = TRUE)
# unique is used to get the name of all population / ethnics without duplicates
names=unique(NAm2$Pop)
npop=length(names)
# coordinates for each pop
coord=unique(NAm2[, c("Pop","long","lat")])
colPalette=rep(c("black","red","cyan","orange","brown","blue","pink","purple","darkgreen"),3)
# type 16 = circle / type 15 = square / type 25 = triangle
pch=rep(c(16,15,25),each=9)
# the display works because for each population we have the latitude and the longitude
plot(coord[, c("long","lat")],pch=pch,col=colPalette,asp=1)
# asp allows to have the correct ratio between  axis  longitude and latitude
# Then the map is not deformed  
legend("bottomleft",legend=names,col=colPalette,lty=-1,pch=pch,cex=.75,ncol=2,lwd=2)
library(maps);map("world",add=T)

The description of the script is in the commentaries above.

2. Regression

# turn the matrix into a geneticData frame
geneticData = NAm2[,-c(1:7)]
geneticDataFrame = data.frame(geneticData)
# creation of the linear model
reg = lm(formula = long ~ . , data = geneticDataFrame)
# uncomment that following part to see the summary of the regression
# summary(reg)

All values are NA (Not Available) because we have too much predictors. The \(X\) matrix is cannot be reversed so we have to use the PCA method to counter that problem.

3.PCA Principal Component Analysis

a)

The main goal of the PCA is to reduce the number of predictors. In order to achieve that, we have to extract the most important components of the geneticData set. These new variables are a linear combinaison of the original ones. The number of principal components is smaller than the number of original variables. The aim is to maximize the variation of the geneticData set.

b)

# install.packages("factoextra")
library(factoextra)
Loading required package: ggplot2
Welcome! Related Books: `Practical Guide To Cluster Analysis in R` at https://goo.gl/13EFCZ
allGenes <- geneticDataFrame[, -1]
pcaNAm2 <- prcomp(allGenes, scale = FALSE)
fviz_eig(pcaNAm2)

vectProportionOfVariance <- pcaNAm2.pca$sdev^2/sum(pcaNAm2.pca$sdev^2)
# summary(pcaNAm2.pca)

It is better to use scale = TRUE by convention because we get a better representation of the percentage of explained values.

FALSE : 494 = linear combianison of the 5000 genes We notice that we get only 494 principal components instead of approximately 5000 which correspond to the number of genes. This can be explained because “” in the covariance matrix the eigenvalues from 495 to ~5000 are equals to zero. In reality, we only select the genes that best explain the model up to 500. If we had to really find the “best-explaining”, which means finding the most explicative genes from 1 to 5000 we would have to test all the combinaisons of 494 genes.

c)

caxes=c(1,2)
plot(pcaNAm2$x[,caxes],col="white")
for (i in 1:npop) {
  # print(names[i])
  lines(pcaNAm2$x[which(NAm2[,3]==names[i]),caxes],
  type="p",col=colPalette[i],pch=pch[i])
}
legend("topleft",legend=names,col=colPalette,lty=-
1,pch=pch,cex=.65,ncol=3,lwd=2)

caxes=c(5,6)
plot(pcaNAm2$x[,caxes],col="white")
for (i in 1:npop) {
  # print(names[i])
  lines(pcaNAm2$x[which(NAm2[,3]==names[i]),caxes],
  type="p",col=colPalette[i],pch=pch[i])
}
legend("topright",legend=names,col=colPalette,lty=-
1,pch=pch,cex=.65,ncol=3,lwd=2)

The Ache population is well identified by both components. The Surui can also be identified by both components but more especially by the PC1.

The Karitiana population is well identified by the PC5. The Pima population is well identified by the PC6.

d)

inertia = vectProportionOfVariance[1] + vectProportionOfVariance[2]
sprintf("The inertia is equal to %f.", inertia)
[1] "The inertia is equal to 0.033979."

According to what we have seen until now, we have two ideas.

The first idea is that we have seen that we can distinguish two population using two principal components. Therefore, as there are 27 different populations, we could deduce that we would need 14 pairs of principal components.

However, we can ask ourselves if we can really each time distinguish two different populations using two principal components. It might not be the case because we have just tested it with two pairs of principal components. Thus, we could think about an alternative way, which is adding the variance of the first principal components until the sum is equal to a certain level. This level could be 25% or 50%. This last one means using a lot of principal components.

numberOfPCs <- function(vectorProportionOfVariance, percentage) {
  tmp = 0.0
  i = 1
  while (tmp < percentage) {
    tmp = tmp + vectProportionOfVariance[i]
    i = i + 1
  }
  return(i)
}
nbrPC25 <- numberOfPCs(vectProportionOfVariance, 0.25)
nbrPC50 <- numberOfPCs(vectProportionOfVariance, 0.5)
nbrPC75 <- numberOfPCs(vectProportionOfVariance, 0.75)
nbrPC99 <- numberOfPCs(vectProportionOfVariance, 0.99)
sprintf("Number of principal components needed in order to cover at least 25 percent of variance : %i", nbrPC25)
[1] "Number of principal components needed in order to cover at least 25 percent of variance : 51"
sprintf("Number of principal components needed in order to cover at least 50 percent of variance : %i", nbrPC50)
[1] "Number of principal components needed in order to cover at least 50 percent of variance : 140"
sprintf("Number of principal components needed in order to cover at least 75 percent of variance : %i", nbrPC75)
[1] "Number of principal components needed in order to cover at least 75 percent of variance : 268"
sprintf("Number of principal components needed in order to cover at least 99 percent of variance : %i", nbrPC99)
[1] "Number of principal components needed in order to cover at least 99 percent of variance : 477"

With the test we have established just before we can see that \(1/10\) of the data contain \(25%\) of the variance : So it seems like a good idea to use only \(50\) PC to get most of the information within least data. So we might use that number of PC in order to represent genetic markers. The other numbers give a idea of the number needed if we want to cover more information.

4. PCR Principal Components Regression

a)

# pcaNAm2$x are scores : size 494 * 494 matrix. We take all the ligns but only 250 columns
first250PCs <- pcaNAm2$x[, c(1:250)]
# size 494 * 251 => first column become the longitude
long <- c(NAm2$long, first250PCs)
longMatrix <- matrix(data = long, nrow = 494, ncol = 251)
dataFrameLong <- data.frame(longMatrix)
#dataFrameLong
lmlong = lm(formula = X1 ~ . , data = dataFrameLong)
# size 494 * 251 => first column become the latitude
lat <- c(NAm2$lat, first250PCs)
latMatrix <- matrix(data = lat, nrow = 494, ncol = 251)
dataFrameLat <- data.frame(latMatrix)
dataFrameLat
lmlat = lm(formula = X1 ~ . , data = dataFrameLat)
plot(lmlong$fitted.values,lmlat$fitted.values,col="white")
for (i in 1:npop) {
  #print(names[i])
  lines(lmlong$fitted.values[which(NAm2[,3]==names[i])],
        lmlat$fitted.values[which(NAm2[,3]==names[i])],
        type="p",col=colPalette[i],pch=pch[i])
}
legend("bottomleft",legend=names,col=colPalette,lty=-1,pch=pch,cex=.75,ncol=3,lwd=2)
library(maps);map("world",add=T)

If we compare it to the map from the question 1, we can clearly say that this map represent very optimisticaly places where population live. It looks quite messy if we just take a look around Panama. However if we examine the estimated places of the Chipewyan, Cree, Ojibwa, Pima and Huiliche populations, we can very distinctly see that the estimation is pretty accurate.

b)

# import the library needed to use rdist.earth
library("fields")
# select the theory latitude and longitude of populations
theoryPosition <- matrix(data = c(NAm2$long, NAm2$lat), nrow = 494, ncol = 2)
#theoryPosition
# vector used to stock the mean of distance for each population
stock <- c()
tmpIndex <- 0
for (i in 1:npop) {
  # vector long estimated for population n°i
  tmpLong <- lmlong$fitted.values[which(NAm2[,3]==names[i])]
  
  # vector lat estimated for population n°i
  tmpLat <- lmlat$fitted.values[which(NAm2[,3]==names[i])]
  
  # matrix of the estimated positions
  estimatedPosition <- matrix(data = c(tmpLong, tmpLat), ncol = 2)
  #print(estimatedPosition)
  
  len <- length(estimatedPosition)/2
  
  # by making unique we assume that two estimated coordinates will never be at the same exact position
  diff <- unique(rdist.earth(x1 = theoryPosition[tmpIndex:(tmpIndex+len), ], x2 = estimatedPosition, miles = F, R = NULL))
  
  # stock the mean of distance from the theorical position in kilometers
  stock[i] <- mean(diff)
  
  tmpIndex <- tmpIndex + len
}
plot(c(1:npop), stock, main = "Mean difference in kilometers between theorical and estimated positions",
     xlab = "N° associated to the population", 
     ylab = "Mean of the distance in kilometers")
linearTest <- lm(stock ~ c(1:npop))
abline(linearTest)

Thanks to the previous graph, we can say that the estimated positions are on average approximately \(1100\) kilometers away from the theorical position. It might seem a lot at first sight because it sounds huge if we just have a human sensible approach. Nonetheless it is not really a lot in reality if we size to the scale of the earth. Indeed, we could just think about the fact that the earth’s circonference is equal to \(40075\) kilometers. If we draw a circle around the theorical position for each population, on average, the estimation of the position will be contained inside the circle.

5) PCR and Cross-Validation

a)

LS0tCnRpdGxlOiAiVFAyIGdlbmV0aWNEYXRhIE1pbmluZyAtIFBDQS1yZWdyZXNzaW9uIGluIGdlbmV0aWNzIgpvdXRwdXQ6IGh0bWxfbm90ZWJvb2sKLS0tCgpMYW15YWEgQk9VWkJJQkEKQWxleGFuZHJlIFBPVVBFQVUKRWxvaXNlIEpVTElFTgoKIyMjIDEuIGdlbmV0aWNEYXRhCgpgYGB7cn0KCk5BbTIgPSByZWFkLnRhYmxlKCJOQW0yLnR4dCIsIGhlYWRlciA9IFRSVUUpCgojIHVuaXF1ZSBpcyB1c2VkIHRvIGdldCB0aGUgbmFtZSBvZiBhbGwgcG9wdWxhdGlvbiAvIGV0aG5pY3Mgd2l0aG91dCBkdXBsaWNhdGVzCm5hbWVzPXVuaXF1ZShOQW0yJFBvcCkKbnBvcD1sZW5ndGgobmFtZXMpCgojIGNvb3JkaW5hdGVzIGZvciBlYWNoIHBvcApjb29yZD11bmlxdWUoTkFtMlssIGMoIlBvcCIsImxvbmciLCJsYXQiKV0pCmNvbFBhbGV0dGU9cmVwKGMoImJsYWNrIiwicmVkIiwiY3lhbiIsIm9yYW5nZSIsImJyb3duIiwiYmx1ZSIsInBpbmsiLCJwdXJwbGUiLCJkYXJrZ3JlZW4iKSwzKQoKIyB0eXBlIDE2ID0gY2lyY2xlIC8gdHlwZSAxNSA9IHNxdWFyZSAvIHR5cGUgMjUgPSB0cmlhbmdsZQpwY2g9cmVwKGMoMTYsMTUsMjUpLGVhY2g9OSkKCiMgdGhlIGRpc3BsYXkgd29ya3MgYmVjYXVzZSBmb3IgZWFjaCBwb3B1bGF0aW9uIHdlIGhhdmUgdGhlIGxhdGl0dWRlIGFuZCB0aGUgbG9uZ2l0dWRlCnBsb3QoY29vcmRbLCBjKCJsb25nIiwibGF0IildLHBjaD1wY2gsY29sPWNvbFBhbGV0dGUsYXNwPTEpCgojIGFzcCBhbGxvd3MgdG8gaGF2ZSB0aGUgY29ycmVjdCByYXRpbyBiZXR3ZWVuICBheGlzICBsb25naXR1ZGUgYW5kIGxhdGl0dWRlCiMgVGhlbiB0aGUgbWFwIGlzIG5vdCBkZWZvcm1lZCAgCmxlZ2VuZCgiYm90dG9tbGVmdCIsbGVnZW5kPW5hbWVzLGNvbD1jb2xQYWxldHRlLGx0eT0tMSxwY2g9cGNoLGNleD0uNzUsbmNvbD0yLGx3ZD0yKQoKbGlicmFyeShtYXBzKTttYXAoIndvcmxkIixhZGQ9VCkKCgpgYGAKClRoZSBkZXNjcmlwdGlvbiBvZiB0aGUgc2NyaXB0IGlzIGluIHRoZSBjb21tZW50YXJpZXMgYWJvdmUuCgojIyMgMi4gUmVncmVzc2lvbgoKYGBge3J9CgojIHR1cm4gdGhlIG1hdHJpeCBpbnRvIGEgZ2VuZXRpY0RhdGEgZnJhbWUKZ2VuZXRpY0RhdGEgPSBOQW0yWywtYygxOjcpXQoKZ2VuZXRpY0RhdGFGcmFtZSA9IGRhdGEuZnJhbWUoZ2VuZXRpY0RhdGEpCgojIGNyZWF0aW9uIG9mIHRoZSBsaW5lYXIgbW9kZWwKcmVnID0gbG0oZm9ybXVsYSA9IGxvbmcgfiAuICwgZGF0YSA9IGdlbmV0aWNEYXRhRnJhbWUpCgojIHVuY29tbWVudCB0aGF0IGZvbGxvd2luZyBwYXJ0IHRvIHNlZSB0aGUgc3VtbWFyeSBvZiB0aGUgcmVncmVzc2lvbgojIHN1bW1hcnkocmVnKQoKYGBgCgpBbGwgdmFsdWVzIGFyZSBOQSAoTm90IEF2YWlsYWJsZSkgYmVjYXVzZSB3ZSBoYXZlIHRvbyBtdWNoIHByZWRpY3RvcnMuIFRoZSAkWCQgbWF0cml4IGlzIGNhbm5vdCBiZSByZXZlcnNlZCBzbyB3ZSBoYXZlIHRvIHVzZSB0aGUgUENBIG1ldGhvZCB0byBjb3VudGVyIHRoYXQgcHJvYmxlbS4KCiMjIyAzLlBDQSBQcmluY2lwYWwgQ29tcG9uZW50IEFuYWx5c2lzCgojIyMjIGEpIAoKVGhlIG1haW4gZ29hbCBvZiB0aGUgUENBIGlzIHRvIHJlZHVjZSB0aGUgbnVtYmVyIG9mIHByZWRpY3RvcnMuIEluIG9yZGVyIHRvIGFjaGlldmUgdGhhdCwgd2UgaGF2ZSB0byBleHRyYWN0IHRoZSBtb3N0IGltcG9ydGFudCBjb21wb25lbnRzIG9mIHRoZSBnZW5ldGljRGF0YSBzZXQuIFRoZXNlIG5ldyB2YXJpYWJsZXMgYXJlIGEgbGluZWFyIGNvbWJpbmFpc29uIG9mIHRoZSBvcmlnaW5hbCBvbmVzLiBUaGUgbnVtYmVyIG9mIHByaW5jaXBhbCBjb21wb25lbnRzIGlzIHNtYWxsZXIgdGhhbiB0aGUgbnVtYmVyIG9mIG9yaWdpbmFsIHZhcmlhYmxlcy4gVGhlIGFpbSBpcyB0byBtYXhpbWl6ZSB0aGUgdmFyaWF0aW9uIG9mIHRoZSBnZW5ldGljRGF0YSBzZXQuCgojIyMjIGIpCgpgYGB7cn0KCiMgaW5zdGFsbC5wYWNrYWdlcygiZmFjdG9leHRyYSIpCgpsaWJyYXJ5KGZhY3RvZXh0cmEpCgphbGxHZW5lcyA8LSBnZW5ldGljRGF0YUZyYW1lWywgLTFdCgpwY2FOQW0yIDwtIHByY29tcChhbGxHZW5lcywgc2NhbGUgPSBGQUxTRSkKCmZ2aXpfZWlnKHBjYU5BbTIpCgp2ZWN0UHJvcG9ydGlvbk9mVmFyaWFuY2UgPC0gcGNhTkFtMi5wY2Ekc2Rldl4yL3N1bShwY2FOQW0yLnBjYSRzZGV2XjIpCgojIHN1bW1hcnkocGNhTkFtMi5wY2EpCgoKYGBgCgpJdCBpcyBiZXR0ZXIgdG8gdXNlIHNjYWxlID0gVFJVRSBieSBjb252ZW50aW9uIGJlY2F1c2Ugd2UgZ2V0IGEgYmV0dGVyIHJlcHJlc2VudGF0aW9uIG9mIHRoZSBwZXJjZW50YWdlIG9mIGV4cGxhaW5lZCB2YWx1ZXMuCgpGQUxTRSA6IDQ5NCA9IGxpbmVhciBjb21iaWFuaXNvbiBvZiB0aGUgNTAwMCBnZW5lcwpXZSBub3RpY2UgdGhhdCB3ZSBnZXQgb25seSA0OTQgcHJpbmNpcGFsIGNvbXBvbmVudHMgaW5zdGVhZCBvZiBhcHByb3hpbWF0ZWx5IDUwMDAgd2hpY2ggY29ycmVzcG9uZCB0byB0aGUgbnVtYmVyIG9mIGdlbmVzLiBUaGlzIGNhbiBiZSBleHBsYWluZWQgYmVjYXVzZSAiIiBpbiB0aGUgY292YXJpYW5jZSBtYXRyaXggdGhlIGVpZ2VudmFsdWVzIGZyb20gNDk1IHRvIH41MDAwIGFyZSBlcXVhbHMgdG8gemVyby4gSW4gcmVhbGl0eSwgd2Ugb25seSBzZWxlY3QgdGhlIGdlbmVzIHRoYXQgYmVzdCBleHBsYWluIHRoZSBtb2RlbCB1cCB0byA1MDAuIElmIHdlIGhhZCB0byByZWFsbHkgZmluZCB0aGUgImJlc3QtZXhwbGFpbmluZyIsIHdoaWNoIG1lYW5zIGZpbmRpbmcgdGhlIG1vc3QgZXhwbGljYXRpdmUgZ2VuZXMgZnJvbSAxIHRvIDUwMDAgd2Ugd291bGQgaGF2ZSB0byB0ZXN0IGFsbCB0aGUgY29tYmluYWlzb25zIG9mIDQ5NCBnZW5lcy4gICAKCiMjIyMgYykKCmBgYHtyfQoKY2F4ZXM9YygxLDIpCnBsb3QocGNhTkFtMiR4WyxjYXhlc10sY29sPSJ3aGl0ZSIpCgpmb3IgKGkgaW4gMTpucG9wKSB7CiAgIyBwcmludChuYW1lc1tpXSkKICBsaW5lcyhwY2FOQW0yJHhbd2hpY2goTkFtMlssM109PW5hbWVzW2ldKSxjYXhlc10sCiAgdHlwZT0icCIsY29sPWNvbFBhbGV0dGVbaV0scGNoPXBjaFtpXSkKfQpsZWdlbmQoInRvcGxlZnQiLGxlZ2VuZD1uYW1lcyxjb2w9Y29sUGFsZXR0ZSxsdHk9LQoxLHBjaD1wY2gsY2V4PS42NSxuY29sPTMsbHdkPTIpCgpjYXhlcz1jKDUsNikKcGxvdChwY2FOQW0yJHhbLGNheGVzXSxjb2w9IndoaXRlIikKCmZvciAoaSBpbiAxOm5wb3ApIHsKICAjIHByaW50KG5hbWVzW2ldKQogIGxpbmVzKHBjYU5BbTIkeFt3aGljaChOQW0yWywzXT09bmFtZXNbaV0pLGNheGVzXSwKICB0eXBlPSJwIixjb2w9Y29sUGFsZXR0ZVtpXSxwY2g9cGNoW2ldKQp9CmxlZ2VuZCgidG9wcmlnaHQiLGxlZ2VuZD1uYW1lcyxjb2w9Y29sUGFsZXR0ZSxsdHk9LQoxLHBjaD1wY2gsY2V4PS42NSxuY29sPTMsbHdkPTIpCgpgYGAKClRoZSBBY2hlIHBvcHVsYXRpb24gaXMgd2VsbCBpZGVudGlmaWVkIGJ5IGJvdGggY29tcG9uZW50cy4gVGhlIFN1cnVpIGNhbiBhbHNvIGJlIGlkZW50aWZpZWQgYnkgYm90aCBjb21wb25lbnRzIGJ1dCBtb3JlIGVzcGVjaWFsbHkgYnkgdGhlIFBDMS4KClRoZSBLYXJpdGlhbmEgcG9wdWxhdGlvbiBpcyB3ZWxsIGlkZW50aWZpZWQgYnkgdGhlIFBDNS4gVGhlIFBpbWEgcG9wdWxhdGlvbiBpcyB3ZWxsIGlkZW50aWZpZWQgYnkgdGhlIFBDNi4KCiMjIyMgZCkKCgpgYGB7cn0KCmluZXJ0aWEgPSB2ZWN0UHJvcG9ydGlvbk9mVmFyaWFuY2VbMV0gKyB2ZWN0UHJvcG9ydGlvbk9mVmFyaWFuY2VbMl0KCnNwcmludGYoIlRoZSBpbmVydGlhIGlzIGVxdWFsIHRvICVmLiIsIGluZXJ0aWEpCgpgYGAKCgpBY2NvcmRpbmcgdG8gd2hhdCB3ZSBoYXZlIHNlZW4gdW50aWwgbm93LCB3ZSBoYXZlIHR3byBpZGVhcy4KClRoZSBmaXJzdCBpZGVhIGlzIHRoYXQgd2UgaGF2ZSBzZWVuIHRoYXQgd2UgY2FuIGRpc3Rpbmd1aXNoIHR3byBwb3B1bGF0aW9uIHVzaW5nIHR3byBwcmluY2lwYWwgY29tcG9uZW50cy4gVGhlcmVmb3JlLCBhcyB0aGVyZSBhcmUgMjcgZGlmZmVyZW50IHBvcHVsYXRpb25zLCB3ZSBjb3VsZCBkZWR1Y2UgdGhhdCB3ZSB3b3VsZCBuZWVkIDE0IHBhaXJzIG9mIHByaW5jaXBhbCBjb21wb25lbnRzLiAKCkhvd2V2ZXIsIHdlIGNhbiBhc2sgb3Vyc2VsdmVzIGlmIHdlIGNhbiByZWFsbHkgZWFjaCB0aW1lIGRpc3Rpbmd1aXNoIHR3byBkaWZmZXJlbnQgcG9wdWxhdGlvbnMgdXNpbmcgdHdvIHByaW5jaXBhbCBjb21wb25lbnRzLiBJdCBtaWdodCBub3QgYmUgdGhlIGNhc2UgYmVjYXVzZSB3ZSBoYXZlIGp1c3QgdGVzdGVkIGl0IHdpdGggdHdvIHBhaXJzIG9mIHByaW5jaXBhbCBjb21wb25lbnRzLiBUaHVzLCB3ZSBjb3VsZCB0aGluayBhYm91dCBhbiBhbHRlcm5hdGl2ZSB3YXksIHdoaWNoIGlzIGFkZGluZyB0aGUgdmFyaWFuY2Ugb2YgdGhlIGZpcnN0IHByaW5jaXBhbCBjb21wb25lbnRzIHVudGlsIHRoZSBzdW0gaXMgZXF1YWwgdG8gYSBjZXJ0YWluIGxldmVsLiBUaGlzIGxldmVsIGNvdWxkIGJlIDI1JSBvciA1MCUuIFRoaXMgbGFzdCBvbmUgbWVhbnMgdXNpbmcgYSBsb3Qgb2YgcHJpbmNpcGFsIGNvbXBvbmVudHMuIAoKYGBge3J9CgpudW1iZXJPZlBDcyA8LSBmdW5jdGlvbih2ZWN0b3JQcm9wb3J0aW9uT2ZWYXJpYW5jZSwgcGVyY2VudGFnZSkgewogIHRtcCA9IDAuMAogIGkgPSAxCiAgd2hpbGUgKHRtcCA8IHBlcmNlbnRhZ2UpIHsKICAgIHRtcCA9IHRtcCArIHZlY3RQcm9wb3J0aW9uT2ZWYXJpYW5jZVtpXQogICAgaSA9IGkgKyAxCiAgfQogIHJldHVybihpKQp9CgpuYnJQQzI1IDwtIG51bWJlck9mUENzKHZlY3RQcm9wb3J0aW9uT2ZWYXJpYW5jZSwgMC4yNSkKbmJyUEM1MCA8LSBudW1iZXJPZlBDcyh2ZWN0UHJvcG9ydGlvbk9mVmFyaWFuY2UsIDAuNSkKbmJyUEM3NSA8LSBudW1iZXJPZlBDcyh2ZWN0UHJvcG9ydGlvbk9mVmFyaWFuY2UsIDAuNzUpCm5iclBDOTkgPC0gbnVtYmVyT2ZQQ3ModmVjdFByb3BvcnRpb25PZlZhcmlhbmNlLCAwLjk5KQoKc3ByaW50ZigiTnVtYmVyIG9mIHByaW5jaXBhbCBjb21wb25lbnRzIG5lZWRlZCBpbiBvcmRlciB0byBjb3ZlciBhdCBsZWFzdCAyNSBwZXJjZW50IG9mIHZhcmlhbmNlIDogJWkiLCBuYnJQQzI1KQpzcHJpbnRmKCJOdW1iZXIgb2YgcHJpbmNpcGFsIGNvbXBvbmVudHMgbmVlZGVkIGluIG9yZGVyIHRvIGNvdmVyIGF0IGxlYXN0IDUwIHBlcmNlbnQgb2YgdmFyaWFuY2UgOiAlaSIsIG5iclBDNTApCnNwcmludGYoIk51bWJlciBvZiBwcmluY2lwYWwgY29tcG9uZW50cyBuZWVkZWQgaW4gb3JkZXIgdG8gY292ZXIgYXQgbGVhc3QgNzUgcGVyY2VudCBvZiB2YXJpYW5jZSA6ICVpIiwgbmJyUEM3NSkKc3ByaW50ZigiTnVtYmVyIG9mIHByaW5jaXBhbCBjb21wb25lbnRzIG5lZWRlZCBpbiBvcmRlciB0byBjb3ZlciBhdCBsZWFzdCA5OSBwZXJjZW50IG9mIHZhcmlhbmNlIDogJWkiLCBuYnJQQzk5KQoKYGBgCgpXaXRoIHRoZSB0ZXN0IHdlIGhhdmUgZXN0YWJsaXNoZWQganVzdCBiZWZvcmUgd2UgY2FuIHNlZSB0aGF0ICQxLzEwJCAgb2YgdGhlIGRhdGEgY29udGFpbiAkMjUlJCBvZiB0aGUgdmFyaWFuY2UgOiBTbyBpdCBzZWVtcyBsaWtlIGEgZ29vZCBpZGVhIHRvIHVzZSBvbmx5ICQ1MCQgUEMgdG8gZ2V0IG1vc3Qgb2YgdGhlIGluZm9ybWF0aW9uIHdpdGhpbiBsZWFzdCBkYXRhLiBTbyB3ZSBtaWdodCB1c2UgdGhhdCBudW1iZXIgb2YgUEMgaW4gb3JkZXIgdG8gcmVwcmVzZW50IGdlbmV0aWMgbWFya2Vycy4KVGhlIG90aGVyIG51bWJlcnMgZ2l2ZSBhIGlkZWEgb2YgdGhlIG51bWJlciBuZWVkZWQgaWYgd2Ugd2FudCB0byBjb3ZlciBtb3JlIGluZm9ybWF0aW9uLgoKIyMjIDQuIFBDUiBQcmluY2lwYWwgQ29tcG9uZW50cyBSZWdyZXNzaW9uCgojIyMjIGEpCgpgYGB7cn0KCiMgcGNhTkFtMiR4IGFyZSBzY29yZXMgOiBzaXplIDQ5NCAqIDQ5NCBtYXRyaXguIFdlIHRha2UgYWxsIHRoZSBsaWducyBidXQgb25seSAyNTAgY29sdW1ucwpmaXJzdDI1MFBDcyA8LSBwY2FOQW0yJHhbLCBjKDE6MjUwKV0KCiMgc2l6ZSA0OTQgKiAyNTEgPT4gZmlyc3QgY29sdW1uIGJlY29tZSB0aGUgbG9uZ2l0dWRlCmxvbmcgPC0gYyhOQW0yJGxvbmcsIGZpcnN0MjUwUENzKQpsb25nTWF0cml4IDwtIG1hdHJpeChkYXRhID0gbG9uZywgbnJvdyA9IDQ5NCwgbmNvbCA9IDI1MSkKZGF0YUZyYW1lTG9uZyA8LSBkYXRhLmZyYW1lKGxvbmdNYXRyaXgpCiNkYXRhRnJhbWVMb25nCmxtbG9uZyA9IGxtKGZvcm11bGEgPSBYMSB+IC4gLCBkYXRhID0gZGF0YUZyYW1lTG9uZykKCiMgc2l6ZSA0OTQgKiAyNTEgPT4gZmlyc3QgY29sdW1uIGJlY29tZSB0aGUgbGF0aXR1ZGUKbGF0IDwtIGMoTkFtMiRsYXQsIGZpcnN0MjUwUENzKQpsYXRNYXRyaXggPC0gbWF0cml4KGRhdGEgPSBsYXQsIG5yb3cgPSA0OTQsIG5jb2wgPSAyNTEpCmRhdGFGcmFtZUxhdCA8LSBkYXRhLmZyYW1lKGxhdE1hdHJpeCkKZGF0YUZyYW1lTGF0CmxtbGF0ID0gbG0oZm9ybXVsYSA9IFgxIH4gLiAsIGRhdGEgPSBkYXRhRnJhbWVMYXQpCgpwbG90KGxtbG9uZyRmaXR0ZWQudmFsdWVzLGxtbGF0JGZpdHRlZC52YWx1ZXMsY29sPSJ3aGl0ZSIpCgpmb3IgKGkgaW4gMTpucG9wKSB7CiAgI3ByaW50KG5hbWVzW2ldKQogIGxpbmVzKGxtbG9uZyRmaXR0ZWQudmFsdWVzW3doaWNoKE5BbTJbLDNdPT1uYW1lc1tpXSldLAogICAgICAgIGxtbGF0JGZpdHRlZC52YWx1ZXNbd2hpY2goTkFtMlssM109PW5hbWVzW2ldKV0sCiAgICAgICAgdHlwZT0icCIsY29sPWNvbFBhbGV0dGVbaV0scGNoPXBjaFtpXSkKfQoKbGVnZW5kKCJib3R0b21sZWZ0IixsZWdlbmQ9bmFtZXMsY29sPWNvbFBhbGV0dGUsbHR5PS0xLHBjaD1wY2gsY2V4PS43NSxuY29sPTMsbHdkPTIpCgpsaWJyYXJ5KG1hcHMpO21hcCgid29ybGQiLGFkZD1UKQoKYGBgCgpJZiB3ZSBjb21wYXJlIGl0IHRvIHRoZSBtYXAgZnJvbSB0aGUgcXVlc3Rpb24gMSwgd2UgY2FuIGNsZWFybHkgc2F5IHRoYXQgdGhpcyBtYXAgcmVwcmVzZW50IHZlcnkgb3B0aW1pc3RpY2FseSBwbGFjZXMgd2hlcmUgcG9wdWxhdGlvbiBsaXZlLiBJdCBsb29rcyBxdWl0ZSBtZXNzeSBpZiB3ZSBqdXN0IHRha2UgYSBsb29rIGFyb3VuZCBQYW5hbWEuIEhvd2V2ZXIgaWYgd2UgZXhhbWluZSB0aGUgZXN0aW1hdGVkIHBsYWNlcyBvZiB0aGUgQ2hpcGV3eWFuLCBDcmVlLCBPamlid2EsIFBpbWEgYW5kIEh1aWxpY2hlIHBvcHVsYXRpb25zLCB3ZSBjYW4gdmVyeSBkaXN0aW5jdGx5IHNlZSB0aGF0IHRoZSBlc3RpbWF0aW9uIGlzIHByZXR0eSBhY2N1cmF0ZS4KCiMjIyMgYikgCgpgYGB7cn0KCiMgaW1wb3J0IHRoZSBsaWJyYXJ5IG5lZWRlZCB0byB1c2UgcmRpc3QuZWFydGgKbGlicmFyeSgiZmllbGRzIikKCiMgc2VsZWN0IHRoZSB0aGVvcnkgbGF0aXR1ZGUgYW5kIGxvbmdpdHVkZSBvZiBwb3B1bGF0aW9ucwp0aGVvcnlQb3NpdGlvbiA8LSBtYXRyaXgoZGF0YSA9IGMoTkFtMiRsb25nLCBOQW0yJGxhdCksIG5yb3cgPSA0OTQsIG5jb2wgPSAyKQojdGhlb3J5UG9zaXRpb24KCiMgdmVjdG9yIHVzZWQgdG8gc3RvY2sgdGhlIG1lYW4gb2YgZGlzdGFuY2UgZm9yIGVhY2ggcG9wdWxhdGlvbgpzdG9jayA8LSBjKCkKCnRtcEluZGV4IDwtIDAKZm9yIChpIGluIDE6bnBvcCkgewogICMgdmVjdG9yIGxvbmcgZXN0aW1hdGVkIGZvciBwb3B1bGF0aW9uIG7CsGkKICB0bXBMb25nIDwtIGxtbG9uZyRmaXR0ZWQudmFsdWVzW3doaWNoKE5BbTJbLDNdPT1uYW1lc1tpXSldCiAgCiAgIyB2ZWN0b3IgbGF0IGVzdGltYXRlZCBmb3IgcG9wdWxhdGlvbiBuwrBpCiAgdG1wTGF0IDwtIGxtbGF0JGZpdHRlZC52YWx1ZXNbd2hpY2goTkFtMlssM109PW5hbWVzW2ldKV0KICAKICAjIG1hdHJpeCBvZiB0aGUgZXN0aW1hdGVkIHBvc2l0aW9ucwogIGVzdGltYXRlZFBvc2l0aW9uIDwtIG1hdHJpeChkYXRhID0gYyh0bXBMb25nLCB0bXBMYXQpLCBuY29sID0gMikKICAjcHJpbnQoZXN0aW1hdGVkUG9zaXRpb24pCiAgCiAgbGVuIDwtIGxlbmd0aChlc3RpbWF0ZWRQb3NpdGlvbikvMgogIAogICMgYnkgbWFraW5nIHVuaXF1ZSB3ZSBhc3N1bWUgdGhhdCB0d28gZXN0aW1hdGVkIGNvb3JkaW5hdGVzIHdpbGwgbmV2ZXIgYmUgYXQgdGhlIHNhbWUgZXhhY3QgcG9zaXRpb24KICBkaWZmIDwtIHVuaXF1ZShyZGlzdC5lYXJ0aCh4MSA9IHRoZW9yeVBvc2l0aW9uW3RtcEluZGV4Oih0bXBJbmRleCtsZW4pLCBdLCB4MiA9IGVzdGltYXRlZFBvc2l0aW9uLCBtaWxlcyA9IEYsIFIgPSBOVUxMKSkKICAKICAjIHN0b2NrIHRoZSBtZWFuIG9mIGRpc3RhbmNlIGZyb20gdGhlIHRoZW9yaWNhbCBwb3NpdGlvbiBpbiBraWxvbWV0ZXJzCiAgc3RvY2tbaV0gPC0gbWVhbihkaWZmKQogIAogIHRtcEluZGV4IDwtIHRtcEluZGV4ICsgbGVuCn0KCnBsb3QoYygxOm5wb3ApLCBzdG9jaywgbWFpbiA9ICJNZWFuIGRpZmZlcmVuY2UgaW4ga2lsb21ldGVycyBiZXR3ZWVuIHRoZW9yaWNhbCBhbmQgZXN0aW1hdGVkIHBvc2l0aW9ucyIsCiAgICAgeGxhYiA9ICJOwrAgYXNzb2NpYXRlZCB0byB0aGUgcG9wdWxhdGlvbiIsIAogICAgIHlsYWIgPSAiTWVhbiBvZiB0aGUgZGlzdGFuY2UgaW4ga2lsb21ldGVycyIpCmxpbmVhclRlc3QgPC0gbG0oc3RvY2sgfiBjKDE6bnBvcCkpCmFibGluZShsaW5lYXJUZXN0KQoKYGBgCgpUaGFua3MgdG8gdGhlIHByZXZpb3VzIGdyYXBoLCB3ZSBjYW4gc2F5IHRoYXQgdGhlIGVzdGltYXRlZCBwb3NpdGlvbnMgYXJlIG9uIGF2ZXJhZ2UgYXBwcm94aW1hdGVseSAkMTEwMCQga2lsb21ldGVycyBhd2F5IGZyb20gdGhlIHRoZW9yaWNhbCBwb3NpdGlvbi4gSXQgbWlnaHQgc2VlbSBhIGxvdCBhdCBmaXJzdCBzaWdodCBiZWNhdXNlIGl0IHNvdW5kcyBodWdlIGlmIHdlIGp1c3QgaGF2ZSBhIGh1bWFuIHNlbnNpYmxlIGFwcHJvYWNoLiBOb25ldGhlbGVzcyBpdCBpcyBub3QgcmVhbGx5IGEgbG90IGluIHJlYWxpdHkgaWYgd2Ugc2l6ZSB0byB0aGUgc2NhbGUgb2YgdGhlIGVhcnRoLiBJbmRlZWQsIHdlIGNvdWxkIGp1c3QgdGhpbmsgYWJvdXQgdGhlIGZhY3QgdGhhdCB0aGUgZWFydGgncyBjaXJjb25mZXJlbmNlIGlzIGVxdWFsIHRvICQ0MDA3NSQga2lsb21ldGVycy4gSWYgd2UgZHJhdyBhIGNpcmNsZSBhcm91bmQgdGhlIHRoZW9yaWNhbCBwb3NpdGlvbiBmb3IgZWFjaCBwb3B1bGF0aW9uLCBvbiBhdmVyYWdlLCB0aGUgZXN0aW1hdGlvbiBvZiB0aGUgcG9zaXRpb24gd2lsbCBiZSBjb250YWluZWQgaW5zaWRlIHRoZSBjaXJjbGUuICAgIAoKCiMjIyA1KSBQQ1IgYW5kIENyb3NzLVZhbGlkYXRpb24KCiMjIyMgYSkKCgoKCgo=